Introduction

This document was created during the 2020 NABs R Markdown workshop and serves as an example workflow for my future use of R Markdown. It presents an exploratory analysis of macroinvertebrate metrics and environmental variables associated with Cazenovia Lake.

Workflow

  1. Open RStudio
  2. Create an R Project
  3. Add a data folder to the R project directory
  4. Create an R Markdown document
  5. Use Knit to compile the document
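
The final step can also be run from the R console instead of the Knit button. The sketch below assumes the document declares a lake parameter in its YAML header (the filter step in Preprocessing reads params$lake, which suggests it does). The file name and the helper render_lake_report() are hypothetical and only illustrate the idea.

```r
# Hypothetical file name; replace with the actual .Rmd in the project.
rmd_file <- "cazenovia_analysis.Rmd"

# Hypothetical helper: render the document for one lake by overriding
# the lake parameter that the document is assumed to declare.
render_lake_report <- function(input, lake) {
  rmarkdown::render(
    input = input,
    params = list(lake = lake),
    output_file = paste0("report_", tolower(lake), ".html")
  )
}

# Render only if the file is present, so the sketch is safe to source.
if (file.exists(rmd_file)) {
  render_lake_report(rmd_file, "Cazenovia")
}
```

Passing a different lake name would produce a separate report from the same source document.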

Import Data

Import the thesis data provided by Zachary M. Smith at the R Markdown Crash Course Workshop on March 3rd, 2020.

# Build the file path relative to the project root with here::here().
thesis.df <- read.csv(here::here("data", "zms_thesis_metrics.csv"),
                      stringsAsFactors = FALSE)

Preprocessing

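Attach the tidyverse packages, replace the lake abbreviations with full lake names, and keep only the lake(s) selected by the document's lake parameter (params$lake).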

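library(tidyverse)

# Replace the lake abbreviations with full names; any unexpected
# abbreviation is labeled "ERROR".
thesis.df <- thesis.df %>%
  mutate(lake = case_when(
    lake %in% "caz" ~ "Cazenovia",
    lake %in% "onon" ~ "Onondaga",
    lake %in% "ot" ~ "Otisco",
    TRUE ~ "ERROR"
  ),
  # Set the order in which the lakes appear in plots and tables. Values
  # that are not listed in levels (including "ERROR") become NA.
  lake = factor(lake, levels = c("Onondaga",
                                 "Otisco",
                                 "Cazenovia")))

# Keep only the lake(s) specified by the document's lake parameter.
thesis.df <- thesis.df %>%
  filter(lake %in% params$lake)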
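
One detail of the recoding step is easy to miss: factor() turns any value that is not listed in levels into NA, so the "ERROR" label never survives the conversion. The sketch below uses hypothetical toy data (the code "xyz" is invented for illustration) and keeps the recoded name in a separate column so that unmatched codes can be inspected before they are converted.

```r
library(dplyr)

# Toy data: "xyz" is a hypothetical, unrecognized lake code.
toy.df <- data.frame(lake = c("caz", "onon", "ot", "xyz"),
                     stringsAsFactors = FALSE)

recoded.df <- toy.df %>%
  mutate(lake_name = case_when(
    lake %in% "caz" ~ "Cazenovia",
    lake %in% "onon" ~ "Onondaga",
    lake %in% "ot" ~ "Otisco",
    TRUE ~ "ERROR"
  ),
  # "ERROR" is not among the levels, so it becomes NA here.
  lake_factor = factor(lake_name,
                       levels = c("Onondaga", "Otisco", "Cazenovia")))

# Original codes that did not match any known lake.
unmatched <- recoded.df$lake[recoded.df$lake_name == "ERROR"]
```

Checking that unmatched is empty before filtering is one way to catch typos in the lake column early.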

For more details about the DT package visit https://rstudio.github.io/DT/.

library(DT)

datatable(thesis.df, options = list(scrollX = TRUE))

Study Area

For more details about the leaflet package visit https://rstudio.github.io/leaflet/.

library(leaflet)

pal <- colorFactor(c("#619Cff", "#F8766D", "#00BA38"),
                   domain = c("Cazenovia", "Onondaga", "Otisco"))

leaflet(data = thesis.df,
        options = leafletOptions(minZoom = 7,
                                 maxZoom = 13)) %>% 
  addTiles() %>% 
  addCircleMarkers(~long, ~lat,
                   fillOpacity = 0.75,
                   fillColor = ~pal(lake),
                   stroke = FALSE,
                   popup = paste("Sample ID:", thesis.df$unique_id, "<br/>",
                                 "Lake:", thesis.df$lake, "<br/>",
                                 "Latitude:", thesis.df$lat, "<br/>",
                                 "Longitude:", thesis.df$long))

Plot

Scatter Plot

For more details about the plotly package visit https://plot.ly/ggplot2/.

library(plotly)

scatter.plot <- ggplot(thesis.df, aes(substrate_size_d50, pct_diptera)) +
  geom_point(aes(color = lake)) +
  geom_smooth(method = "lm")

ggplotly(scatter.plot)
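
Because the color aesthetic is set only inside geom_point(), geom_smooth(method = "lm") draws a single least-squares line through all plotted points. The same kind of model can be fit with lm() to inspect its coefficients. The sketch below uses hypothetical toy values, not the thesis data; only the column names are taken from the plot above.

```r
# Hypothetical toy values standing in for the thesis.df columns.
toy.df <- data.frame(
  substrate_size_d50 = c(1, 2, 3, 4, 5),
  pct_diptera = c(32, 41, 48, 61, 68)
)

# Ordinary least-squares fit, the same model form that
# geom_smooth(method = "lm") uses for its trend line.
fit <- lm(pct_diptera ~ substrate_size_d50, data = toy.df)

# Intercept and slope of the fitted line.
fit_coef <- coef(fit)

# Proportion of variance explained by the fit.
fit_r2 <- summary(fit)$r.squared
```

Running the same call with data = thesis.df would describe the line shown in the scatter plot.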

Boxplot

ggplot(thesis.df, aes(lake, richness, fill = lake)) +
  geom_boxplot()

Conclusions

  1. The relative abundance of Diptera taxa in Cazenovia Lake ranged from 30.2% to 79.8%.
  2. The median taxonomic richness observed in Cazenovia Lake was 22.5.
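
Summary values like these can be computed directly from the data rather than typed by hand. The code that produced the numbers above is not shown, so the following is only a sketch using hypothetical toy values. The two endpoints of pct_diptera are copied from the reported range; every other value is invented.

```r
# Hypothetical toy values standing in for the thesis.df columns.
toy.df <- data.frame(
  pct_diptera = c(30.1775148, 45.2, 61.7, 79.8319328),
  richness = c(20, 22, 23, 25)
)

# Minimum and maximum relative abundance, rounded to one decimal
# place for presentation.
diptera_range <- round(range(toy.df$pct_diptera), 1)

# Median taxonomic richness.
richness_median <- median(toy.df$richness)
```

Rounding at the point of reporting keeps the stored data at full precision while making the text easier to read.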